1 Model comparison

We compare three models, labeled B, BB, and ZIBB in the sections below.

See vignette ‘ModelOverview’ to inspect the different models.

2 LOO

2.1 B

Computed from 4000 by 819 log-likelihood matrix

         Estimate   SE
elpd_loo  -2713.3 45.4
p_loo       596.0 14.1
looic      5426.5 90.8
------
Monte Carlo SE of elpd_loo is NA.

Pareto k diagnostic values:
                         Count Pct.    Min. n_eff
(-Inf, 0.5]   (good)     104   12.7%   377
 (0.5, 0.7]   (ok)       168   20.5%   169
   (0.7, 1]   (bad)      411   50.2%   15
   (1, Inf)   (very bad) 136   16.6%   3
See help('pareto-k-diagnostic') for details.

2.2 BB

Computed from 4000 by 819 log-likelihood matrix

         Estimate    SE
elpd_loo  -3341.6  61.9
p_loo       271.4  13.5
looic      6683.2 123.7
------
Monte Carlo SE of elpd_loo is NA.

Pareto k diagnostic values:
                         Count Pct.    Min. n_eff
(-Inf, 0.5]   (good)     484   59.1%   321
 (0.5, 0.7]   (ok)       212   25.9%   130
   (0.7, 1]   (bad)      116   14.2%   25
   (1, Inf)   (very bad)   7    0.9%   13
See help('pareto-k-diagnostic') for details.

2.3 ZIBB

Computed from 4000 by 819 log-likelihood matrix

         Estimate    SE
elpd_loo  -3340.8  62.2
p_loo       283.1  14.2
looic      6681.7 124.3
------
Monte Carlo SE of elpd_loo is NA.

Pareto k diagnostic values:
                         Count Pct.    Min. n_eff
(-Inf, 0.5]   (good)     472   57.6%   519
 (0.5, 0.7]   (ok)       222   27.1%   82
   (0.7, 1]   (bad)      116   14.2%   19
   (1, Inf)   (very bad)   9    1.1%   4
See help('pareto-k-diagnostic') for details.
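
The Pareto k tables for the three models can be summarized as the share of observations with k > 0.7, the range that loo labels "bad" or "very bad". The sketch below is a small Python illustration, not part of the original analysis; the counts are copied from the outputs above, and the helper name `share_above_07` is made up for this example.

```python
# Pareto k counts copied from the three loo outputs, in bin order:
# (-Inf, 0.5], (0.5, 0.7], (0.7, 1], (1, Inf)
pareto_k_counts = {
    "B": [104, 168, 411, 136],
    "BB": [484, 212, 116, 7],
    "ZIBB": [472, 222, 116, 9],
}


def share_above_07(counts):
    """Fraction of observations falling in the two bins with k > 0.7."""
    return (counts[2] + counts[3]) / sum(counts)


shares = {model: share_above_07(c) for model, c in pareto_k_counts.items()}

for model, share in shares.items():
    print(f"{model}: {100 * share:.1f}% of observations have k > 0.7")
```

For model B roughly two thirds of the 819 observations fall in this range, against about 15% for BB and ZIBB, so the elpd_loo estimate of B is probably the least reliable of the three.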

2.4 Compare all three models

                                         elpd_diff se_diff elpd_loo p_loo  looic
loo::loo(loo::extract_log_lik(B$glm))          0.0     0.0  -2713.3 596.0 5426.5
loo::loo(loo::extract_log_lik(ZIBB$glm))    -627.6    24.8  -3340.8 283.1 6681.7
loo::loo(loo::extract_log_lik(BB$glm))      -628.3    23.8  -3341.6 271.4 6683.2
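
The columns of the comparison table are related by simple arithmetic: looic is -2 times elpd_loo, and elpd_diff is each model's elpd_loo minus that of the best model. The Python sketch below re-derives these quantities from the rounded values printed above, so small mismatches in the last digit are expected. The ratio of |elpd_diff| to se_diff is added here only as a rough scale for the differences and is not part of the original output.

```python
# (elpd_loo, looic) copied from the loo outputs above
reported = {
    "B": (-2713.3, 5426.5),
    "ZIBB": (-3340.8, 6681.7),
    "BB": (-3341.6, 6683.2),
}
# se_diff copied from the comparison table (defined relative to the best model)
se_diff = {"ZIBB": 24.8, "BB": 23.8}

# The best model is the one with the highest elpd_loo
best = max(reported, key=lambda m: reported[m][0])

# looic recomputed as -2 * elpd_loo, to compare with the printed column
looic_recomputed = {m: -2 * elpd for m, (elpd, _) in reported.items()}

# elpd_diff relative to the best model (0 for the best model itself)
elpd_diff = {m: reported[m][0] - reported[best][0] for m in reported}

# Size of each difference in units of its standard error
ratio = {m: abs(elpd_diff[m]) / se_diff[m] for m in se_diff}

print("best model:", best)
for m in se_diff:
    print(f"{m}: elpd_diff = {elpd_diff[m]:.1f}, |elpd_diff| / se_diff = {ratio[m]:.1f}")
```

Both differences are more than 25 standard errors in favor of B, while BB and ZIBB are almost indistinguishable from each other. Given the Pareto k diagnostics reported for B, this ranking should be read with caution.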

3 Posterior predictive checks

3.1 Prediction of gene usage within repertoires [count]

  • Usage in raw counts
  • Error bars represent 95% HDI
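
For readers unfamiliar with the HDI (highest density interval), one common way to estimate it from posterior draws is to take the narrowest interval that contains 95% of the sorted draws. The Python sketch below shows that idea under the assumption of a unimodal posterior; it is not necessarily how the intervals in the figures were computed, and the simulated draws are purely illustrative.

```python
import math
import random


def hdi(draws, mass=0.95):
    """Narrowest interval containing `mass` of the draws (unimodal case)."""
    xs = sorted(draws)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of draws inside the interval
    # Slide a window over k consecutive sorted draws and keep the narrowest
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


# Illustrative draws only: 4000 simulated values standing in for the
# posterior predictive draws of one gene's count in one repertoire
random.seed(1)
draws = [random.gauss(100, 10) for _ in range(4000)]
lo, hi = hdi(draws)
print(f"95% HDI: [{lo:.1f}, {hi:.1f}]")
```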

3.2 Prediction of gene usage within repertoires [%]

  • Usage in %
  • Error bars represent 95% HDI

3.3 Prediction error at a repertoire level

  • e[%] = |Yhat[%] - Y[%]|, or
  • e[count] = |Yhat[count] - Y[count]|
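
The two error definitions above can be illustrated with a small Python example. The counts are hypothetical, and converting predicted counts to percentages by dividing by the observed repertoire size is an assumption made for this sketch.

```python
def abs_error(predicted, observed):
    """Element-wise absolute prediction error |Yhat - Y|."""
    return [abs(p - o) for p, o in zip(predicted, observed)]


def to_percent(counts, total):
    """Convert counts to usage in % of the repertoire size `total`."""
    return [100 * c / total for c in counts]


# Hypothetical repertoire with three genes
observed_counts = [100, 60, 40]   # Y[count]
predicted_counts = [92, 66, 42]   # Yhat[count]
n = sum(observed_counts)          # repertoire size, 200

e_count = abs_error(predicted_counts, observed_counts)
e_percent = abs_error(to_percent(predicted_counts, n),
                      to_percent(observed_counts, n))

print(e_count)    # [8, 6, 2]
print(e_percent)  # [4.0, 3.0, 1.0]
```

The percentage error is comparable across repertoires of different sizes, whereas the count error grows with repertoire size.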

3.4 Prediction of overall gene usage [%]

  • Error bars represent 95% HDI

4 Comparison of coefficients for differential gene usage

Comparisons:

In each pairwise comparison of models, the five genes with the most discrepant inferred usage coefficients are annotated.
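
A minimal Python sketch of how such genes could be picked is given below. It assumes that discrepancy is measured as the absolute difference between the two models' coefficients for the same gene, which the text does not state; the gene names and coefficient values are hypothetical.

```python
def most_discrepant(coef_a, coef_b, n=5):
    """Genes present in both models, ranked by |coef_a - coef_b| (largest first)."""
    shared = coef_a.keys() & coef_b.keys()
    # Ties are broken alphabetically so the result is deterministic
    ranked = sorted(shared, key=lambda g: (-abs(coef_a[g] - coef_b[g]), g))
    return ranked[:n]


# Hypothetical usage coefficients from two models
coef_bb = {"gene_1": 0.10, "gene_2": -0.50, "gene_3": 0.80, "gene_4": 0.00,
           "gene_5": 1.20, "gene_6": -0.30, "gene_7": 0.40}
coef_zibb = {"gene_1": 0.12, "gene_2": -1.50, "gene_3": 0.30, "gene_4": 0.05,
             "gene_5": 1.90, "gene_6": -0.30, "gene_7": 0.10}

top5 = most_discrepant(coef_bb, coef_zibb)
print(top5)  # ['gene_2', 'gene_5', 'gene_3', 'gene_7', 'gene_4']
```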

4.1 Reality Check